FINAL PROJECT: Utility Rates and Unemployment

Link to repo: https://github.com/sarstanc/DV_FinalProject

Load CSV files into Oracle

Select the CSV file to import
Check that the left and right enclosures do not conflict with the data
Name the table
Choose all columns that go into the Oracle table
Rename columns to remove spaces and leading digits
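The renaming step can also be done before import. The following is a minimal base R sketch, not the procedure used in the screenshots: the helper name clean_column_names, the "C_" prefix, and the sample headers are all hypothetical. It relies on the fact that an unquoted Oracle identifier must begin with a letter.

```r
# Hypothetical helper: make CSV headers safe to use as Oracle column names.
clean_column_names <- function(names_in) {
  cleaned <- trimws(names_in)
  # Replace runs of spaces and punctuation with a single underscore.
  cleaned <- gsub("[^A-Za-z0-9]+", "_", cleaned)
  # Drop underscores left at either end.
  cleaned <- gsub("^_+|_+$", "", cleaned)
  # Prefix names that start with a digit (the "C_" prefix is an assumption).
  starts_with_digit <- grepl("^[0-9]", cleaned)
  cleaned[starts_with_digit] <- paste0("C_", cleaned[starts_with_digit])
  toupper(cleaned)
}

# Sample headers are illustrative, not taken from the project files.
raw_names <- c("Zip Code", "2013 Population", "Unemp. Rate")
clean_column_names(raw_names)
```

Applying the helper to names(df) before writing the CSV back out would leave nothing to rename in the import wizard.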

Load data into Tableau

Load each data set one at a time to blend rather than join

Data blending in Tableau

Edit relationships between data sets to link common columns
Existing relationships are listed in the Edit Relationships dialog. Select “custom” to change them.
Use Population by zip code as the primary data set and Unemployment and Utilities as the secondary data sets.
Orange indicates the secondary data sources
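To see how a blend differs from a join, it may help to mimic one outside Tableau. A blend aggregates each secondary source to the level of the linking field and then attaches the result to the primary source, keeping every primary row. The base R sketch below illustrates this with made-up data; the column names (zip, population, res_rate) and all values are assumptions.

```r
# Hypothetical sample data; names and values are illustrative only.
population <- data.frame(
  zip        = c("78701", "78702", "78703"),
  population = c(7000, 22000, 19000),
  stringsAsFactors = FALSE
)
utilities <- data.frame(
  zip      = c("78701", "78701", "78702"),
  res_rate = c(0.10, 0.12, 0.11),
  stringsAsFactors = FALSE
)

# Step 1: aggregate the secondary source to the linking field (zip).
utilities_by_zip <- aggregate(res_rate ~ zip, data = utilities, FUN = mean)

# Step 2: left join onto the primary source so every primary row is kept.
blended <- merge(population, utilities_by_zip, by = "zip", all.x = TRUE)
blended
```

Zip codes with no match in the secondary source keep their row and get NA for the rate, which is why the later steps filter out nulls.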

Creating a calculated field

Resrate, Indrate, and Comrate are all dimensions from the Utility Rates data source. Since these dimensions contain string values, they must individually be converted to integers in order to calculate an average.
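The same conversion-then-average logic can be sketched in base R. The rows below are invented for illustration, and the lower-case column names are assumptions. Note that as.integer truncates decimals, so as.numeric would be the safer choice if the rates contain fractional values.

```r
# Hypothetical rows; the real values come from the Utility Rates data source.
utility_rates <- data.frame(
  zip     = c("78701", "78702"),
  resrate = c("12", "10"),
  comrate = c("9", "8"),
  indrate = c("6", "6"),
  stringsAsFactors = FALSE
)

rate_columns <- c("resrate", "comrate", "indrate")

# Convert each string column to integer individually.
utility_rates[rate_columns] <- lapply(utility_rates[rate_columns], as.integer)

# Average the three rates per row, mirroring the calculated field.
utility_rates$avgrate <- rowMeans(utility_rates[rate_columns])
utility_rates
```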

Creating visualizations in Tableau

Crosstabs with utility company name and unemployment rate by zip code
Generate longitude and latitude to show residential utility rates compared to population by zip code
Filter out null residential utility costs
Bar graph shows average utility cost broken down into residential, commercial, and industrial costs per zip code
Filter out null and 0 value utility costs
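An equivalent filter in base R might look like the sketch below. The data frame and its avgrate column are hypothetical stand-ins for the blended utility costs.

```r
# Hypothetical costs, including one null (NA) and one zero value.
costs <- data.frame(
  zip     = c("78701", "78702", "78703", "78704"),
  avgrate = c(9, NA, 0, 8),
  stringsAsFactors = FALSE
)

# Keep only rows where the cost is present and nonzero.
filtered <- costs[!is.na(costs$avgrate) & costs$avgrate != 0, ]
filtered
```

The NA check comes first because comparing NA with 0 yields NA rather than FALSE.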

Creating visualizations in R

library("rjson")
library("RCurl")
## Loading required package: bitops
require(tidyr)
## Loading required package: tidyr
require(dplyr) 
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
source("../04_R/unemployment_rank.R")

Three-part join
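The contents of unemployment_rank.R are not shown here, so the following is only a sketch of what a three-part join followed by a ranking could look like in base R. The data frames, column names, and values are all hypothetical.

```r
# Hypothetical stand-ins for the three data sets, keyed by zip code.
population <- data.frame(
  zip = c("78701", "78702"), population = c(7000, 22000),
  stringsAsFactors = FALSE
)
unemployment <- data.frame(
  zip = c("78701", "78702"), unemp_rate = c(4.1, 6.3),
  stringsAsFactors = FALSE
)
utilities <- data.frame(
  zip = c("78701", "78702"), utility_name = c("Utility A", "Utility B"),
  stringsAsFactors = FALSE
)

# Fold the list of tables into one by merging on zip, two at a time.
joined <- Reduce(
  function(left, right) merge(left, right, by = "zip"),
  list(population, unemployment, utilities)
)

# Rank zip codes by unemployment rate, highest first (illustrative only).
joined$unemp_rank <- rank(-joined$unemp_rate, ties.method = "min")
joined
```

Since dplyr is already loaded in this report, chained inner_join calls would be a natural alternative to Reduce with merge.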

source("../04_R/avgrate.R")
## Loading required package: ggplot2

Values are multiplied by -1 as a workaround for the calculated field.